Module 01

Reserve the first level headings (#) for the start of a new Module. This will help to organize your portfolio in an intuitive fashion.
Note: Please edit this template to your heart’s content. This is meant to be the armature upon which you build your individual portfolio. You do not need to keep this instructive text in your final portfolio, although you do need to keep module and assignment names so we can identify what is what.

Module 01 portfolio check

The first of your second level headers (##) is to be used for the portfolio content checks. The Module 01 portfolio check has been built for you directly into this template, but will also be available as a stand-alone markdown document on the MICB425 GitHub so that you know what is required in each module section in your portfolio. The completion status and comments will be filled in by the instructors during portfolio checks when your current portfolios are pulled from GitHub.

  • Installation check
    • Completion status:
    • Comments:
  • Portfolio repo setup
    • Completion status:
    • Comments:
  • RMarkdown Pretty PDF Challenge
    • Completion status:
    • Comments:
  • Evidence worksheet_01
    • Completion status:
    • Comments:
  • Evidence worksheet_02
    • Completion status:
    • Comments:
  • Evidence worksheet_03
    • Completion status:
    • Comments:
  • Problem Set_01
    • Completion status:
    • Comments:
  • Problem Set_02
    • Completion status:
    • Comments:
  • Writing assessment_01
    • Completion status:
    • Comments:
  • Additional Readings
    • Completion status:
    • Comments:

Data science Friday

The remaining second level headers (##) are for separating data science Friday, regular course, and project content. In this module, you will only need to include data science Friday and regular course content; projects will come later in the course.

Installation check

Third level headers (###) should be used for links to assignments, evidence worksheets, problem sets, and readings, as seen here.

Use this space to include your installation screenshots.

Portfolio repo setup

Detail the code you used to create, initialize, and push your portfolio repo to GitHub. This will be helpful as you will need to repeat many of these steps to update your portfolio throughout the course.

git status - to check whether my local repo is up to date with the remote master branch and to see which files have changed

git fetch then git pull - to fetch and merge new files from the remote master repo into my local repo

git add . - to place all changed files in the staging area

git commit -m "Commit message" - to commit the staged files along with a descriptive message

git push - to push the committed files to the remote master repo

RMarkdown pretty PDF challenge

The following assignment is an exercise for the reproduction of this .html document using the RStudio and RMarkdown tools we’ve shown you in class. Hopefully by the end of this, you won’t feel at all the way this poor PhD student does. We’re here to help, and when it comes to R, the internet is a really valuable resource. This open-source program has all kinds of tutorials online.

http://phdcomics.com/ Comic posted 1-17-2018

Challenge Goals

The goal of this R Markdown html challenge is to give you an opportunity to play with a bunch of different RMarkdown formatting. Consider it a chance to flex your RMarkdown muscles. Your goal is to write your own RMarkdown that rebuilds this html document as close to the original as possible. So, yes, this means you get to copy my irreverent tone exactly in your own Markdowns. It’s a little window into my psyche. Enjoy =)

hint: go to the PhD Comics website to see if you can find the image above
If you can’t find that exact image, just find a comparable image from the PhD Comics website and include it in your markdown

Here’s a header!

Let’s be honest, this header is a little arbitrary. But show me that you can reproduce headers with different levels please. This is a level 3 header, for your reference (you can most easily tell this from the table of contents).

Another header, now with maths

Perhaps you’re already really confused by the whole markdown thing. Maybe you’re so confused that you’ve forgotten how to add. Never fear! A calculator R is here:

1231521+1234155628098
## [1] 1.234157e+12

Table Time

Or maybe, after you’ve added those numbers, you feel like it’s about time for a table! I’m going to leave all the guts of the coding here so you can see how libraries (R packages) are loaded into R (more on that later). It’s not terribly pretty, but it hints at how R works and how you will use it in the future. The summary function used below is a nice data exploration function that you may use in the future.

library(knitr)
kable(summary(cars),caption="I made this table with kable in the knitr package library")
I made this table with kable in the knitr package library
speed dist
Min. : 4.0 Min. : 2.00
1st Qu.:12.0 1st Qu.: 26.00
Median :15.0 Median : 36.00
Mean :15.4 Mean : 42.98
3rd Qu.:19.0 3rd Qu.: 56.00
Max. :25.0 Max. :120.00

And now you’ve almost finished your first RMarkdown! Feeling excited? We are! In fact, we’re so excited that maybe we need a big finale eh? Here’s ours! Include a fun gif of your choice!

Data Science Exercise

Data Science Friday Assignment Jan 26

#Load Library
library("tidyverse")
## ── Attaching packages ───────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
## ✔ tibble  1.4.2     ✔ dplyr   0.7.4
## ✔ tidyr   0.8.0     ✔ stringr 1.2.0
## ✔ readr   1.1.1     ✔ forcats 0.2.0
## ── Conflicts ──────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
#Exercise 1
metadata = read.table(file="Saanich.metadata.txt", header=TRUE, row.names = 1, sep="\t", na.strings="NAN")
OTU = read.table(file="Saanich.OTU.txt", header=TRUE, row.names = 1, sep="\t", na.strings="NAN")

#Exercise 2

metadata %>% rownames_to_column('sample') %>% 
  filter(CH4_nM >=100, Temperature_C<=10) %>%
  column_to_rownames('sample') %>% 
  select(Depth_m,CH4_nM,Temperature_C)
##              Depth_m  CH4_nM Temperature_C
## SI072_S3_185     185 310.068         9.091
## SI072_S3_200     200 774.034         9.117
#Exercise 3

nM_to_uM_Metadata_coversion <-metadata %>% rownames_to_column('sample') %>% 
  select(matches("nM"), matches('sample')) %>% 
  mutate(N2O_uM = N2O_nM/1000, Std_N2O_uM = Std_N2O_nM/1000, CH4_uM = CH4_nM/1000, Std_CH4_uM = Std_CH4_nM/1000) %>% 
  column_to_rownames('sample')

#For Exercise 3: All variables measured in nM are converted to μM. The output table, nM_to_uM_Metadata_coversion, shows only the original nM and the converted μM variables.
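As a minimal sketch of the nM to μM conversion in Exercise 3, the snippet below applies the same division by 1000 to a small made-up data frame. The sample names and values are hypothetical stand-ins, not rows from Saanich.metadata.txt, and base R is used here so that the example runs without the tidyverse or the course data files.

```r
# Hypothetical toy data; the values are invented for illustration only.
toy <- data.frame(
  N2O_nM = c(12.5, 30.0),
  CH4_nM = c(310.0, 774.0),
  row.names = c("sample_A", "sample_B")
)

# Find every column whose name ends in "_nM".
nM_cols <- grep("_nM$", names(toy), value = TRUE)

# 1 uM = 1000 nM, so divide by 1000 and swap the suffix in the column names.
uM_values <- toy[nM_cols] / 1000
names(uM_values) <- sub("_nM$", "_uM", nM_cols)

# Keep the original nM columns next to the converted uM columns.
toy_converted <- cbind(toy, uM_values)
print(toy_converted)
```

Selecting columns by their suffix avoids typing each variable name by hand, which may be convenient if more nM variables are added later.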

Data Science Friday Assignment Feb 16

library("tidyverse")

source("https://bioconductor.org/biocLite.R")
## Bioconductor version 3.6 (BiocInstaller 1.28.0), ?biocLite for help
biocLite("phyloseq")
## BioC_mirror: https://bioconductor.org
## Using Bioconductor 3.6 (BiocInstaller 1.28.0), R 3.4.3 (2017-11-30).
## Installing package(s) 'phyloseq'
## 
## The downloaded binary packages are in
##  /var/folders/z3/h3tm9hss3bx6zbhrqjq0f5y80000gn/T//Rtmpz8BFGe/downloaded_packages
## Old packages: 'dbplyr', 'forcats', 'knitr', 'nlme', 'rlang', 'stringr'
library("phyloseq")

load("phyloseq_object.RData")

#Exercise 1

ggplot(metadata, aes(x=PO4_uM, y=Depth_m)) + 
  geom_point(color="purple", shape=17)

#Exercise 2
metadata %>% 
  mutate(Temperature_F= Temperature_C*9/5+32) %>% 
  ggplot() + geom_point(aes(x=Temperature_F, y=Depth_m))

#ggplot with phyloseq
plot_bar(physeq, fill="Phylum")

physeq_percent = transform_sample_counts(physeq, function(x) 100 * x/sum(x))
plot_bar(physeq_percent, fill="Phylum")

plot_bar(physeq_percent, fill="Phylum") + 
  geom_bar(aes(fill=Phylum), stat="identity")

#Exercise 3
plot_bar(physeq_percent, fill="Phylum", title = "Phyla from 10 to 200 m in Saanich Inlet") +
  geom_bar(aes(fill=Phylum), stat="identity") + 
  labs(x="Sample depth", y="Percent relative abundance")

#Faceting
plot_bar(physeq_percent, fill="Phylum") +
  geom_bar(aes(fill=Phylum), stat="identity") +
  facet_wrap(~Phylum)

plot_bar(physeq_percent, fill="Phylum") +
  geom_bar(aes(fill=Phylum), stat="identity") +
  facet_wrap(~Phylum, scales="free_y") +
  theme(legend.position="none")

#Exercise 4

plot_nutrients= metadata %>% 
  select(Depth_m, NH4_uM,NO2_uM, NO3_uM, O2_uM, PO4_uM, SiO2_uM) %>% 
  gather(Nutrients, Concentration, NH4_uM,NO2_uM, NO3_uM, O2_uM, PO4_uM, SiO2_uM)

ggplot(plot_nutrients, aes(x=Depth_m, y=Concentration)) +
  geom_point() + geom_line() +facet_wrap(~Nutrients, scales="free_y") +
  theme(legend.position = "none") 

Origins and Earth Systems

Evidence worksheet 01

The template for the first Evidence Worksheet has been included here. The first thing for any assignment should be link(s) to any relevant literature (which should be included as full citations in a module references section below).

You can copy-paste in the answers you recorded when working through the evidence worksheet into this portfolio template.

As you include Evidence worksheets and Problem sets in the future, ensure that you delineate Questions/Learning Objectives/etc. by using headers that are 4th level and greater. This will still create header markings when you render (knit) the document, but will exclude these levels from the Table of Contents. That’s a good thing. You don’t want to clutter the Table of Contents too much.

Whitman et al 1998

Learning objectives

Describe the numerical abundance of microbial life in relation to ecology and biogeochemistry of Earth systems.

General questions

  • What were the main questions being asked?

    • What is the estimated number of prokaryotes on Earth, specifically in seawater, soil, and the sediment/soil subsurface?
    • How much of the total carbon on Earth is derived from prokaryotes?
  • What were the primary methodological approaches used?

    • Three largest habitats were used to estimate the total number and total carbon of prokaryotes on earth:
    1. For the aquatic environments, the volumes of oceanic water, freshwater and saline lakes, and polar regions were multiplied by the corresponding average cellular densities to calculate the number of cells in each region. For the polar regions in particular, the number of prokaryotes estimated by Delille & Rosiers and the mean areal extent of seasonal ice were also used in the calculation.
    2. For the soil, the authors used detailed direct counts from a coniferous forest ultisol, as it was generally considered representative of forest soil.
    3. For the subsurface, the first approach was based on assumptions about the average porosity of the terrestrial subsurface (3%) and the fraction of total pore space occupied by prokaryotes (0.016%). The other approach multiplied the estimated number of prokaryotes at various groundwater sites by the total volume of groundwater on Earth.
  • Summarize the main results or findings.

    • Based on oceanic, soil, and subsurface habitats, the estimated total number of prokaryotes is 4 to 6 x \(10^{30}\) cells
    • 350 to 550 Pg of C is estimated to be held in prokaryotes
    • The prokaryotic carbon pool is ~ 60 to 100% of the total carbon found in plants globally
    • Prokaryotes contain ~ 85 to 130 Pg of N and 9 to 14 Pg of P, which is about 10-fold more than plants
    • Most prokaryotes are found in ocean, soil, and oceanic and terrestrial subsurface habitats
      • 1.2 x \(10^{29}\) cells in open ocean
      • 2.6 x \(10^{29}\) cells in soil
      • 3.5 x \(10^{30}\) cells in oceanic subsurface
      • 0.25 to 2.5 x \(10^{30}\) cells in terrestrial subsurface
    • The average prokaryotic turnover times are:
      • Upper 200 m of the ocean: 6 to 25 days
      • Ocean below 200 m: 0.8 years
      • Soil: 2.5 years
      • Subsurface: 1 to 2 x \(10^{3}\) years
    • Cellular production rate is ~ 1.7 x \(10^{30}\) cells/year (highest in open ocean)
    • The abundance of prokaryotes offers an enormous capacity for genetic diversity
  • Do new questions arise from the results?

    • There were many assumptions made in this study. Thus, how accurate were the numbers?
    • The prokaryotes’ biomass is rich in nitrogen and phosphorus, exceeding that of plants by an order of magnitude. This is indicative of the significant role prokaryotes play in global C, N, and P nutrient cycles. Are there other global events or processes that prokaryotes are involved in?
    • Microbes have high mutation rates, and this could affect their turnover rates and consequently the cycling of nutrients such as C, N, and P. To what extent do prokaryotes contribute to the total metabolic potential of the Earth’s ecosystems?
    • How diverse are the prokaryotes in each of these habitats?
  • Were there any specific challenges or advantages in understanding the paper (e.g. did the authors provide sufficient background information to understand experimental logic, were methods explained adequately, were any specific assumptions made, were conclusions justified based on the evidence, were the figures or tables useful and easy to understand)?

    • There were a lot of numbers presented in the study, and it would be really helpful if the authors had provided more information on how they performed the various calculations.
    • Combining multiple estimates also compounds the errors associated with them, which decreases the statistical confidence in the results.

Problem set 01

Learning objectives:

Describe the numerical abundance of microbial life in relation to the ecology and biogeochemistry of Earth systems.

Specific questions:

  • What are the primary prokaryotic habitats on Earth and how do they vary with respect to their capacity to support life? Provide a breakdown of total cell abundance for each primary habitat from the tables provided in the text.

    • 1.2 x \(10^{29}\) cells in open ocean
    • 2.6 x \(10^{29}\) cells in soil
    • 3.5 x \(10^{30}\) cells in oceanic subsurface
    • 0.25 to 2.5 x \(10^{30}\) cells in terrestrial subsurface
  • What is the estimated prokaryotic cell abundance in the upper 200 m of the ocean and what fraction of this biomass is represented by marine cyanobacterium including Prochlorococcus? What is the significance of this ratio with respect to carbon cycling in the ocean and the atmospheric composition of the Earth?

    • 3.6 x \(10^{28}\) cells in the upper 200 m of the ocean
    • 2.9 x \(10^{27}\) cells are autotrophs
    • About 8% of the cells in the upper 200 m of the ocean (2.9 x \(10^{27}\) / 3.6 x \(10^{28}\)) are marine autotrophs, including cyanobacteria such as Prochlorococcus
    • This 8% autotrophic fraction is responsible for fixing the carbon cycled through the Earth’s oceans, which ultimately supplies carbon to the remaining 92% of cells, the heterotrophs
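The 8% figure can be checked with a short calculation. This is a sketch with hypothetical variable names that uses the two cell counts quoted in the answer, so it describes a fraction of cells rather than a measured biomass.

```r
# Cell counts quoted in the answer for the upper 200 m of the ocean.
total_cells     <- 3.6e28  # all prokaryotes
autotroph_cells <- 2.9e27  # autotrophs (e.g. Prochlorococcus)

# Fraction of cells that are autotrophs, and the heterotrophic remainder.
autotroph_fraction   <- autotroph_cells / total_cells
heterotroph_fraction <- 1 - autotroph_fraction

# Report both as whole percentages.
round(100 * c(autotrophs = autotroph_fraction,
              heterotrophs = heterotroph_fraction))
```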
  • What is the difference between an autotroph, heterotroph, and a lithotroph based on information provided in the text?

    • An autotroph uses inorganic carbon (i.e. carbon dioxide) as its carbon source, while a heterotroph assimilates organic carbon sources (2). Autotrophs are also self-nourishing and capable of fixing inorganic carbon dioxide into biomass.
    • A lithotroph uses an inorganic chemical as electron source (2).
  • Based on information provided in the text and your knowledge of geography what is the deepest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this depth?

    • The text states that prokaryotes are found up to 4 km deep in the subsurface.
    • Taking into account the deepest point in the ocean, the Mariana Trench (10.9 km deep), life could potentially exist 14.9 km below sea level.
    • The limiting factor at these depths is temperature. At 4 km into the subsurface the temperature is ~ 125 °C.
  • Based on information provided in the text and your knowledge of geography what is the highest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this height?

    • The text suggests that prokaryotes can be found in the atmosphere up to 77 km above the Earth’s surface, which sets the upper bound for supporting prokaryotic life. The highest habitat on land is Mount Everest, approximately 8.8 km above sea level. Factors that can limit the survival of airborne prokaryotes include nutrient availability, moisture (desiccating conditions), and UV radiation (3).
  • Based on estimates of prokaryotic habitat limitation, what is the vertical distance of the Earth’s biosphere measured in km?

    • From the top of Mount Everest (8.8 km high) to the bottom of the Mariana Trench (10.9 km deep, plus 4 km deeper into the sediment), there is a vertical distance of about 24 km in which prokaryotes can presumably live.
  • How was annual cellular production of prokaryotes described in Table 7 column four determined? (Provide an example of the calculation)

    • population size x number of turnovers per year = cells per year

    • Example: Using the data for marine heterotrophs in the upper 200 m: 3.6 x \(10^{28}\) cells x (365 days per year / 16 days per turnover) = 8.2 x \(10^{29}\) cells/year
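The production formula above can be written out as a short R sketch. The variable names are hypothetical, and a 16-day turnover time is assumed here because it is the value used in the mutation calculation later in this problem set and it reproduces the quoted 8.2 x \(10^{29}\) cells/year.

```r
# Marine heterotrophs in the upper 200 m of the ocean.
population_size    <- 3.6e28  # cells
turnover_time_days <- 16      # assumed days per turnover

# Number of times the population is replaced in one year.
turnovers_per_year <- 365 / turnover_time_days

# Annual cellular production = population size x turnovers per year.
cells_per_year <- population_size * turnovers_per_year
signif(cells_per_year, 2)
```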

  • What is the relationship between carbon content, carbon assimilation efficiency and turnover rates in the upper 200m of the ocean? Why does this vary with depth in the ocean and between terrestrial and marine habitats?

    • According to the text, the carbon assimilation efficiency is approximated to be 20%.
    • Assumption: 20 fg of C per prokaryotic cell, which is 20 x \(10^{-30}\) petagrams (Pg) of C per cell
    • Amount of carbon in marine heterotrophs = 3.6 x \(10^{28}\) cells x 20 x \(10^{-30}\) Pg of C/cell = 0.72 Pg of C
    • With about 20% of the carbon assimilated and 80% lost, 4 x 0.72 = 2.88 Pg of C is used per turnover of the marine heterotroph population

    • 51 Pg of C/year x 85% consumed in photic waters = 43 Pg of C/year consumed
    • 43 Pg of C consumed/year / 2.88 Pg of C per turnover = 14.9 turnovers/year, or 1 turnover every 24.5 days

    • The variation in carbon assimilation with depth is primarily due to differences in carbon production and in the composition of the microorganisms found in each habitat.
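The carbon turnover arithmetic in this answer can be retraced in a few lines of R. This is only a sketch: the variable names are hypothetical, and the factor of 4 simply follows the answer's own reasoning rather than a method confirmed by the paper. Using the unrounded product 51 x 0.85 = 43.35 gives slightly different values than the rounded 43 used in the answer.

```r
# Standing stock of carbon in marine heterotrophs (upper 200 m).
cells              <- 3.6e28
carbon_per_cell_Pg <- 20e-15 / 1e15  # 20 fg of C per cell, converted to Pg
standing_carbon    <- cells * carbon_per_cell_Pg  # Pg of C

# Factor of 4 taken from the answer (about 80% lost vs 20% assimilated).
carbon_per_turnover <- 4 * standing_carbon  # Pg of C per turnover

# Carbon consumed in photic waters each year.
net_production    <- 51    # Pg of C per year
fraction_consumed <- 0.85
consumed_per_year <- net_production * fraction_consumed

# Turnovers per year and the matching turnover time in days.
turnovers_per_year <- consumed_per_year / carbon_per_turnover
days_per_turnover  <- 365 / turnovers_per_year
round(c(turnovers_per_year = turnovers_per_year,
        days_per_turnover = days_per_turnover), 1)
```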

  • How were the frequency numbers for four simultaneous mutations in shared genes determined for marine heterotrophs and marine autotrophs given an average mutation rate of 4 x \(10^{-7}\) per DNA replication? (Provide an example of the calculation with units. Hint: cell and generation cancel out)

    • (4 x \(10^{-7}\) mutations/generation)\(^{4}\) = 2.56 x \(10^{-26}\) mutations/generation
    • Population size of marine heterotrophs in the upper 200 m: 3.6 x \(10^{28}\) cells
    • 365 days / 16 days per turnover = 22.8 turnovers/year

    • 3.6 x \(10^{28}\) cells x 22.8 turnovers/year = 8.2 x \(10^{29}\) cells/year

    • 8.2 x \(10^{29}\) cells/year x 2.56 x \(10^{-26}\) mutations/generation = 2.1 x \(10^{4}\) mutations/year

    • 2.1 x \(10^{4}\) mutations/year is about 2.4 per hour, or one every 0.4 hours, as stated in the paper (1).

    • With a fast turnover rate and a big population size, these numbers are possible with respect to microbial population
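The same chain of multiplications can be sketched in R. The variable names are hypothetical, and the inputs are the mutation rate given in the question (4 x \(10^{-7}\) per replication) together with the population size and 16-day turnover time used in the steps above.

```r
# Probability of four simultaneous mutations in one gene per replication.
mutation_rate         <- 4e-7
four_simultaneous     <- mutation_rate^4

# Annual cell production of marine heterotrophs in the upper 200 m.
cells_per_year <- 3.6e28 * (365 / 16)

# Expected number of four-mutation events per year across the population.
events_per_year <- cells_per_year * four_simultaneous

# Convert to an average waiting time between events, in hours.
hours_per_year  <- 365 * 24
hours_per_event <- hours_per_year / events_per_year
signif(c(events_per_year = events_per_year,
         hours_per_event = hours_per_event), 2)
```

Expressing the result as hours per event makes it directly comparable with a waiting time of roughly 0.4 hours.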

  • Given the large population size and high mutation rate of prokaryotic cells, what are the implications with respect to genetic diversity and adaptive potential? Are point mutations the only way in which microbial genomes diversify and adapt?

  • What relationships can be inferred between prokaryotic abundance, diversity, and metabolic potential based on the information provided in the text?

Evidence Worksheet_02 “Life and the Evolution of Earth’s Atmosphere”

Learning objectives:

Comment on the emergence of microbial life and the evolution of Earth systems

  • Indicate the key events in the evolution of Earth systems at each approximate moment in the time series. If times need to be adjusted or added to the timeline to fully account for the development of Earth systems, please do so.

    • 4.6 billion years ago
      • Solar system formed. Inner planets received water vapour and carbon. At this time, carbon dioxide concentration and vapour pressure were high, and the temperature was about 500 degrees Celsius.
    • 4.5 billion years ago
      • The Moon formed, which gave the Earth its spin and tilt, day/night cycles, and seasons.
    • 4.4 billion years ago
      • Zircons (the oldest minerals) formed
      • The Earth had cooled to about 100 degrees Celsius
    • Between 4.4 and 4.1 billion years ago
      • A meteor impact
    • 4.1 billion years ago
      • Evidence of life found in zircons
    • 4.0 billion years ago
      • The oldest rock: Acasta Gneiss
      • There was evidence of plate subduction
      • Greenhouse carbon dioxide increased
      • Meteorite bombardment halted and seawater chemistry stabilized.
    • 3.8 billion years ago
      • Chemical fossils such as carbon isotopes found in rocks provide further evidence for early life. Enrichment of C-12 suggests the possibility of photosynthesis. However, non-photosynthetic autotrophs can also produce a C-12 signature.
    • 3.5 billion years ago
      • Structural fossils such as stromatolites (bacterial aggregations) found in rocks.
      • Early methanogenesis.
    • 3.0 billion years ago
      • A glaciation occurred.
      • Early cyanobacteria, evidence for photosynthesis
      • Great Oxidation Event
      • Life on land
    • 2.7 billion years ago
      • Emergence of prokaryotes
    • 2.2 billion years ago
      • Rocks recognized as red beds
      • Oxygen levels increased sharply
    • 2.1 billion years ago
    • 1.8 billion years ago
    • 1.3 billion years ago

    • 550 million years ago
      • Cambrian explosion, expansion of multicellular life
      • Devonian explosion, emergence of woody land plants
      • Carboniferous period, presence of fish, cephalopods, and corals
      • Formation of Pangea resulted in a dry, harsh climate in its interior, and competition among species increased as they were clustered on one giant land mass.
      • Permian extinction, 95% of species gone
    • 400 million years ago
      • Oxygen levels increased again, resulting in the rise of giants
      • Hence, rise of dinosaurs
      • Cretaceous-Tertiary extinction event, after which no organism over 10 kg survived.
      • Increase in mammal size and diversification
      • Dramatic global warming
      • Ice age event
      • Grasses started to dominate over forests
    • 200,000 years ago
      • Homo sapiens first appeared
  • Describe the dominant physical and chemical characteristics of Earth systems at the following waypoints:

    • Hadean
      • With a temperature of 500 degrees Celsius and high levels of carbon dioxide and water vapour, the Earth was practically a molten object.
    • Archean
      • A glaciation occurred.
      • Soon after, the planet became brown and hazy due to methanogenesis. Methane produced by methanogens helped keep the Earth warm. Otherwise, the Earth would have stayed frozen and perhaps no life would have existed.
    • Precambrian
      • Photosynthesis evolved, resulting in some oxygen in the atmosphere.
    • Proterozoic
      • During the early Proterozoic, another glaciation event occurred. Once again, the Earth system “shut down”.
      • Oxygen reacted with atmospheric methane to form carbon dioxide. This caused a net decrease in the greenhouse effect, making the Earth cold and leading to glaciation.
      • The oxygen also started oxidizing iron, forming banded iron formations, as seen in sedimentary rock.

    • Phanerozoic
      • Oxygenation of the atmosphere increased
      • Plants started to evolve
      • Coal deposits from dead organisms, caused by multiple extinction events, were stored in sediments
      • Once again, glaciation occurred at various periods

Problem set_02 “Microbial Engines”

Learning objectives:

Discuss the role of microbial diversity and formation of coupled metabolism in driving global biogeochemical cycles.

Specific Questions:

  • What are the primary geophysical and biogeochemical processes that create and sustain conditions for life on Earth? How do abiotic versus biotic processes vary with respect to matter and energy transformation and how are they interconnected?

  • Why is Earth’s redox state considered an emergent property?

  • How do reversible electron transfer reactions give rise to element and nutrient cycles at different ecological scales? What strategies do microbes use to overcome thermodynamic barriers to reversible electron flow?

  • Using information provided in the text, describe how the nitrogen cycle partitions between different redox “niches” and microbial groups. Is there a relationship between the nitrogen cycle and climate change?

  • What is the relationship between microbial diversity and metabolic diversity and how does this relate to the discovery of new protein families from microbial community genomes?

  • On what basis do the authors consider microbes the guardians of metabolism?

Module 01 references

Utilize this space to include a bibliography of any literature you want associated with this module. We recommend keeping this as the final header under each module.

An example for Whitman and Wiebe (1998) has been included below.

  1. Whitman WB, Coleman DC, and Wiebe WJ. 1998. Prokaryotes: The unseen majority. Proc Natl Acad Sci USA. 95(12):6578–6583. PMC33863

  2. Bigle W. 2017. MICB 401: Environmental Microbiology Laboratory.

  3. Budny JA. 2017. Book review: Aerobiology-The toxicology of airborne pathogens and toxins. Los Angeles, CA: SAGE Publications. doi:10.1177/1091581816678191